A NOTE ON KING'S ARTICLE ON "THE CORRELATION OF
HISTORICAL ECONOMIC VARIABLES AND THE MISUSE
OF COEFFICIENTS IN THIS CONNECTION."

Errors in the conception of correlation and the use of the correlation 
coefficient are so common a vice of pseudo-scientific research that any 
attempt to correct their influence would be a bootless task. Mistakes of 
this sort mark the beginnings of any newer branch of scientific knowledge 
and may be looked upon as natural growing pains. Were Dr. King's 
article in the December number of the Quarterly Publications the first 
fruit of an immature worker in an uncultivated field, it might be allowed to 
pass without comment. But Dr. King's standing among American statis- 
ticians and the appearance of his article in the publications of the American 
Statistical Association place the matter in a different light, and justify an 
explicit challenge to the contentions he sets forth. 

The gist of Dr. King's argument runs as follows: (1) the correlation 
coefficient is being employed to demonstrate utterly fallacious conclusions, 
especially in the analysis of historical data; (2) such errors rest in part 
upon neglect of the fact that correlation always indicates causation; (3) 
in part the errors are due to failure to separate adequately the forces con- 
tributing to the given events; (4) if causal forces are thoroughly isolated, 
correlation in every case will be perfect; (5) for demonstrating whether 
correlation does or does not exist, "no coefficient equals the graphic 
method"; (6) in the study of historical economic variables, the correlation 
coefficient serves but one purpose — namely, to measure the lag, if there be 
any. 

One need not question Dr. King's contention that the correlation co- 
efficient has been too often employed to support conclusions which have 
been "utterly fallacious and entirely contrary to the facts." Few statistical
devices are "foolproof," and the more delicate the device the more liable it 
is to misuse by the inexpert. Dr. King's indictment, however, might well 
have been made more specific. Blundering by the novice is one thing; 
error by the scientist is another. One can not but feel that much of Dr. 
King's bill of complaint is irrelevant to the work of statisticians of meas- 
urable professional standing. If this be the case, the fact might well have 
been made clear. 

But it is in his analysis of correlation and the correlation coefficient that 
Dr. King is most unfortunate. His discussion of the nature of correlation 
is so extreme and inflexible as to be wholly misleading. To Dr. King, corre- 
lation invariably connotes established causal relationship. Since "every 
cause must produce its effect exactly and invariably," it follows that "there
can be no such thing as imperfect or partial correlation." "Every correla- 
tion to exist must be perfect." Correlation coefficients less than unity 
indicate that cause and effect have not been completely isolated; they 
evidence the confusion or deficiency of our measurements, not any imperfec- 
tion of the correlation itself. 

If correlation were being defined de novo, it is possible that this view 
might prevail. But the term "correlation" has an established meaning. 
Correlation is not identical with causation, though closely connected with it.
Correlation connotes a tendency toward persistent association. In other 
words correlation is "a tendency toward concomitant variation." As 
Bowley states it (Elements of Statistics, p. 316): "When two quantities 
are so related that the fluctuations in one are in sympathy with the
fluctuations of the other, so that an increase or decrease of one is found in
connection with an increase or decrease (or inversely) of the other, and the greater
the magnitude of the change in the one, the greater the magnitude of the 
change in the other, the quantities are said to be correlated." By established
usage and authority, correlation is nothing more than "one-to-one"
correspondence in paired items of selected variables.

If this is the nature of correlation, what is its relation to causation? 
The case of perfect correlation may be first considered. Perfect correlation 
is a theoretical or conceptual limit to the more or less imperfect correlations 
of the world of experience.* Actual correlations, nevertheless, sometimes 
approach perfect correlation so closely as to share its nature. Such
approximately-perfect correlations afford indisputable evidence of (1) joint
characteristics of single entities (e.g., the height and weight of adult males), or
(2) that consistent routine of events which we designate causal connection. 
Complete correlation in the first of these cases informs us that we may 
expect to find invariably associated with a given characteristic of an object 
a stated measure of a second characteristic. Complete correlation in the 
second case informs us that we may expect to find a given event, designated
cause, invariably followed by a second, designated effect.†

If correlation were always perfect, the problem would be relatively simple.
Unfortunately for Dr. King's exposition, data drawn from experience — 
even artificially controlled experience — never exhibit perfect correlation. 
The essential task of correlation studies lies in the interpretation of concrete 
evidence, not in the understanding of conceptual limits. In general the 
question raised by actual data is a question of probability: Do the data 
show a degree of correspondence greater than was to have been expected 
from chance? If not, there is no evidence of a tendency toward persistent 
association, and hence no evidence of correlation. If the degree of corre- 
spondence, however, exceeds that which was to have been expected from 
chance, correlation may be postulated with a confidence varying directly 
as the excess. 
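
The question of whether a degree of correspondence exceeds what chance would
produce can be illustrated with a small modern sketch. It is not part of the
original note: the random re-pairing (permutation) procedure, the function
names pearson_r and chance_exceedance, and the sample figures are all
assumptions chosen for illustration. The sketch re-pairs the items at random
many times and reports how often chance alone yields a coefficient at least
as large as the one observed.

```python
import math
import random


def pearson_r(xs, ys):
    """Product-moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def chance_exceedance(xs, ys, trials=2000, seed=0):
    """Fraction of random re-pairings whose |r| is at least the observed |r|."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)  # break any real pairing between xs and ys
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    return hits / trials


# Hypothetical paired observations, invented for illustration only.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [2.1, 2.9, 4.2, 4.8, 6.5, 6.9, 8.1, 9.4, 9.8, 11.2]

print("observed r:", pearson_r(xs, ys))
print("share of chance pairings at least as strong:", chance_exceedance(xs, ys))
```

A small share in the second figure corresponds to the case described above,
in which correlation may be postulated with some confidence; a large share
means the data give no evidence of persistent association.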

It is clear enough that, when cause and effect are absolutely segregated, 
correlation becomes complete and obvious. Such complete — or virtually 
complete — segregation of forces is the ideal of causal analysis. Dr. King 
cites three reasons for the failure of economic analysis to attain this ideal: 
(1) failure to comprehend the factors involved; (2) lack of information 
essential to the separation of the forces; (3) ignorance of proper statistical 
methods. Unfortunately he neglects the reason which is most fundamental:
the practical impossibility of isolating economic forces completely.
Dr. King, in suggesting the complete segregation of forces in economic
analysis, is not merely giving a counsel of perfection; he is proposing what
is manifestly impossible. The events of economic life are the product of
an intricate, complex maelstrom of forces. Economic science is replete
with examples of the misfortunes attending attempts to abstract influences
from this social complex: what the analysis gains in simplicity, it loses in
reality and applicability. Segregation of forces in economic analysis must
be effected with the utmost care, and can never be more than partial.

* An analogous statement may be made, of course, of perfect independence.

† Conceivably the joint characteristics of the first case may be thought of as coupled effects of a common
cause; thus bringing the case within the class of causal connections. But ordinarily nothing is to be gained
by this view of the matter. Whether or not the two cases at bottom are alike causal, their separation
in practice is helpful.

Dr. King makes much of a parallel between the methods of economic and 
physical science. Unquestionably economic research might profit greatly 
from wider and more intimate acquaintance with the methods of the more 
exact sciences. But a warning should be given against the tendency of 
making research in social science an exact and mechanical process. It is 
unsafe to conduct economic studies with mathematical automatons. In 
the first place, in most cases the original measurements are not sufficiently 
accurate to justify refined methods of analysis. And secondly, the analysis 
itself is a succession of acts of selection and interpretation calling for the 
highest order of intelligence. If left to itself, statistical method soon runs 
amuck. If it is to be successful, there must be wise and steadying guidance. 
Nothing is more mistaken than the widely-held notion that statistical 
method, because of the precision of its mathematical processes, yields per- 
fectly dependable results. It is to be hoped that statistical method will 
make the hypotheses of economic science somewhat more trustworthy; for 
with the aid of this method presumptions of causal connection may surely 
be more firmly established. But it is one of the unavoidable limitations 
of economic science that its laws must remain openly in the presumptive 
form. Fortunately the laws may be made to serve none the less as satis- 
factory working hypotheses for private and public policy. 

Nothing that has been said should be construed as in opposition to as 
thorough a separation of forces as is practicable in economic analysis. In 
the study of time series, secular trend, cyclical movements, and seasonal 
fluctuations should be carefully distinguished. Data should be selected 
with a view to their "greatest possible homogeneity." Studies of correla- 
tion should be supplemented with studies of "partial correlation." Un- 
fortunately these latter practices, involving as they do a restriction of the 
"universe of discourse," sacrifice wealth of material to uniformity of data,
and by basing conclusions necessarily upon a smaller body of evidence in- 
crease the probable error of the result. When all is said and done, a large 
margin of uncertainty thus is bound to persist in economic science. Much 
more is to be gained by acknowledging this fully than by underrating its 
significance. 
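
The trade-off described above, between isolating a force and shrinking the
body of evidence, may be sketched numerically. The sketch assumes the usual
first-order partial correlation formula and the classical probable error of a
coefficient, 0.6745(1 - r^2)/sqrt(n); the note names neither formula, and the
function names and input figures are hypothetical.

```python
import math


def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def probable_error(r, n):
    """Classical probable error of r from n pairs: 0.6745 * (1 - r^2) / sqrt(n)."""
    return 0.6745 * (1 - r ** 2) / math.sqrt(n)


# Hypothetical coefficients: x and y each correlate 0.8 with a third factor z,
# and 0.64 with one another. Holding z constant leaves effectively nothing.
print("partial r of x and y given z:", partial_r(0.64, 0.8, 0.8))

# The same r = 0.6 resting on 100 pairs, then on a more homogeneous 25 pairs.
print("probable error, n = 100:", probable_error(0.6, 100))
print("probable error, n = 25: ", probable_error(0.6, 25))
```

Cutting the number of pairs to a quarter doubles the probable error, which is
the sacrifice of material to uniformity that the paragraph describes.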

Dr. King's argument leads him to discount the use of the correlation 
coefficient. If all correlations were perfect, he could well afford to do so; 
perfect correlation, if actually encountered, would be simple enough to deal 
with. It is just because correlation is never perfect that the coefficient is 
invaluable. As has been stated, the significance of a given association of 
paired variables depends entirely upon the extent to which the association
exceeds that to be expected from chance. The correlation coefficient 
enables an exact measurement of this all-important relationship. For this 
purpose curves are worthless. Curves may yield valuable suggestions for 
further study; they may prove effective graphic representations of correla- 
tion. In general they are a treacherous instrument for proving correlation. 
For this purpose nothing equals the correlation coefficient. 

In the analysis of time variables, Dr. King would limit the use of the 
coefficient to the measurement of lag. Nothing could be more ill-advised. 
It is true that the correlation coefficient in its original form was not adapted 
to the study of fluctuations in time. But the "method of differences" 
corrects this deficiency and restores the coefficient to its position as the most 
serviceable measure of correlation, for historical as for other variables. 
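
A minimal sketch of the "method of differences" follows, on the assumption
that the term means correlating the successive (first) differences of each
series rather than the original items. The two series are invented so that
they share a rising trend while their year-to-year fluctuations run opposite
to one another; the function names are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def first_differences(series):
    """Change from each item to the next; removes a steady trend."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


# Hypothetical annual series: both rise about 10 units a year, but one is
# above its trend in exactly the years the other is below.
series_a = [1, 9, 21, 29, 41, 49, 61, 69, 81, 89]
series_b = [-1, 11, 19, 31, 39, 51, 59, 71, 79, 91]

raw_r = pearson_r(series_a, series_b)
diff_r = pearson_r(first_differences(series_a), first_differences(series_b))

print("r of original items:   ", raw_r)
print("r of first differences:", diff_r)
```

The coefficient of the original items is dominated by the common trend and
is strongly positive; the coefficient of the differences exposes the opposed
short-period movements and is strongly negative.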

It is difficult to say to what extent the defects of Dr. King's article arise 
from mere excess of statement in giving a warning, the need of which all 
might concede. Dr. King's contributions to the literature of statistics are 
a sufficient guarantee that we need fear no abuse of statistical method at his 
hands. Unfortunately his December article gives no equal assurance for 
others. Upon the contrary there is reason to believe that it will add 
materially to that confusion regarding the nature of correlation and the 
correlation coefficient which it has been Dr. King's intention to correct. 

Edmund E. Day. 



